\section{Background}

\subsection{Physics of Electromigration}
The physical principle of EM is the motion of ions under the influence of an electric field~\cite{Jing:jwm}. Under high current, this motion changes the shape of thin metal wires and results in open-circuit or short-circuit failures. The effect is cumulative, and higher temperature exacerbates it. As atoms electromigrate, material is depleted ``upstream'' and accumulates ``downstream'' at sites of flux divergence. This can lead to void formation and material accumulation, causing a large increase in electrical resistance and, eventually, dielectric cracking. Since EM manifests only during chip operation, test schemes applied before operation cannot detect it. Consequently, EM optimization should be considered at the design stage.

\subsection{Facts of Electromigration}
\subsubsection{Scaling Effect}
While technology scaling improves circuit performance, it worsens the EM effect, because a smaller feature size leads to higher current density. Suppose the scaling factor is $z$; the feature size of the newer generation is then $\frac{1}{z}$ times that of the old generation. The working current of a single transistor can be approximated as:
\begin{equation}
I_{w}\approx \frac{9\varepsilon _{s}\mu_{n}AV_{d}^{2}}{8L^{3}}
\end{equation}
where $\varepsilon _{s}$ and $\mu_{n}$ are constants that are independent of technology. The area $A$ scales to $\frac{1}{z^2}$. $V_{d}$ is the supply voltage, which scales to $\frac{1}{z}$ (though it remains constant as technology scales below 45\,nm), and $L$ is the channel length, which also scales to $\frac{1}{z}$. The current of a single transistor therefore scales to $\frac{1}{z}$, while the number of transistors attached to a fixed-length standard cell power rail scales to $z$. Thus the working current of a power rail remains unchanged under scaling.
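
As a sketch, the scaling argument above can be written out term by term. Primed symbols denote the scaled generation, and $N$ (an illustrative symbol introduced only for this sketch) is the number of transistors fed by a fixed-length power rail:
\[
I_{w}' \approx \frac{9\varepsilon_{s}\mu_{n}A'(V_{d}')^{2}}{8(L')^{3}}
= \frac{9\varepsilon_{s}\mu_{n}\,\frac{A}{z^{2}}\,\frac{V_{d}^{2}}{z^{2}}}{8\,\frac{L^{3}}{z^{3}}}
= \frac{I_{w}}{z},
\qquad
N' I_{w}' = (zN)\,\frac{I_{w}}{z} = N I_{w}.
\]
This sketch assumes that $V_{d}$ still scales to $\frac{1}{z}$. If $V_{d}$ is instead held constant and the approximation is taken at face value, the same substitution gives $I_{w}' \approx z I_{w}$, so the rail current would no longer stay fixed.
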
The EM lifetime can be modeled by \textit{Black's equation}~\cite{Jing:BlackEM}, in which the mean time to failure (MTTF) characterizes the severity of EM:
\begin{equation}
\mathrm{MTTF}_{EM}\propto J^{-n}\times e^{\frac{E_{aEM}}{kT}}
\end{equation}
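where $J$ is the current density, $n$ is an empirical exponent, $E_{aEM}$ is the EM activation energy, $k$ is Boltzmann's constant, and $T$ is the absolute temperature. Since the cross-sectional area of a metal wire scales to $\frac{1}{z^2}$ while the rail current stays unchanged, the current density grows by $z^2$; with $n$ usually around 1, technology scaling therefore makes EM roughly $z^2$ times worse. In recent technologies metal-1 no longer shrinks as aggressively as before, but the supply voltage also stays almost constant, so the working current no longer scales down as assumed above. The real EM problem can therefore be more severe than $z^2$.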
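
To make this estimate explicit, a minimal worked form follows, assuming that the rail current $I$ and the temperature $T$ are the same in both generations. Here $A_{m}$ is an illustrative symbol for the wire cross-sectional area, and primes denote the scaled generation:
\[
J' = \frac{I}{A_{m}/z^{2}} = z^{2}J,
\qquad
\frac{\mathrm{MTTF}_{EM}'}{\mathrm{MTTF}_{EM}} = \left(\frac{J'}{J}\right)^{-n} = z^{-2n}.
\]
With $n\approx 1$ the lifetime ratio is about $\frac{1}{z^{2}}$. For a hypothetical scaling factor $z=\sqrt{2}$, the cross-sectional area halves, the current density doubles, and the estimated MTTF is roughly halved.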

\subsubsection{Temperature}
According to Equation (2), a higher on-chip temperature shortens the EM lifetime exponentially. Smaller feature sizes result in higher power density, which in turn worsens the thermal problem.
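
As an illustration of this exponential dependence, Black's equation can be evaluated at two temperatures $T_{1}<T_{2}$ for the same current density, so that the $J^{-n}$ factor cancels:
\[
\frac{\mathrm{MTTF}_{EM}(T_{2})}{\mathrm{MTTF}_{EM}(T_{1})}
= \exp\!\left[\frac{E_{aEM}}{k}\left(\frac{1}{T_{2}}-\frac{1}{T_{1}}\right)\right].
\]
The following numbers are assumptions chosen only for illustration and are not taken from measurements: with $E_{aEM}=0.9$\,eV, $k\approx 8.617\times10^{-5}$\,eV/K, $T_{1}=358$\,K and $T_{2}=378$\,K, the exponent is about $-1.54$ and the ratio is about $0.21$. Under these assumed values, a 20\,K rise shortens the estimated lifetime by roughly a factor of five.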

\subsubsection{Wire Width}
Wire width not only determines the current density, but also affects the geometric arrangement of the polycrystalline metal grains~\cite{4633651}~\cite{4859008}. Given the same current density and temperature, a wire is most vulnerable to EM when its width is about 1--2 times the metal grain size~\cite{4859008}. The grain size depends on the interconnect metal. Therefore, tools and technology libraries should be tuned to avoid such wire widths at the design stage. When the wire width is below one grain size, the MTTF increases significantly as the line width decreases~\cite{4859008}. %The polycrystalline Cu thin film line size should not

\subsection{EM Effects on Power Supply Network}
Different parts of a power network differ in EM severity. An example of a power delivery network is shown in Figure~\ref{fig:metal}. The power grid uses very wide upper-layer metals for whole-chip power delivery. Within each small block, the standard cell power rails convey current to all transistors (Figure~\ref{fig:metal}b). These rails usually use minimum-width metal-1, and the current density on them is significantly higher than on the power grid. For long wires, which is usually the case for power rails and grids, the EM time to failure is found to increase with line width~\cite{283282}. On the other hand, if the metal length is under 10\,$\mu$m, the EM time to failure of narrow wires was observed to be long~\cite{Jing:taiwanEM}. The power supply wires inside the standard cells meet this length requirement and are safe. Considering all on-chip wires, the standard cell power rail therefore has the highest risk of EM failure. In the middle of a power rail, the current direction may change with the input pattern. Such dynamic change never happens at the two ends of a power rail, where the rail is connected to the power grid (red shading in Figure~\ref{fig:metal}c).

\begin{figure}[t]
\centering
\includegraphics[width=0.45\textwidth]{figure/metal}
\caption{\small{The hierarchical on-chip power delivery network scheme.}}
\label{fig:metal}
\end{figure}



\subsection{Healing Effect}

EM occurs when a uni-directional current is applied for a long duration. AC stress can provide a healing effect in metal wires~\cite{5232735}~\cite{Arnaud1999773}~\cite{Tao1998295}~\cite{4558988}. The alternating charge flow reduces the accumulation of ions in any particular direction, so the time-to-failure under AC stress is typically longer than under DC stress. Experimental results for the time-to-failure under AC stress were discussed by Tao et al.~\cite{Jing:add}. Their results showed that uni-directional current increases the resistance of metal wires. If current in the opposite direction is then applied, some but not all of the damage can be healed. The healing effect depends on the AC frequency. Given $\left|\bar{J}\right|^{m}=J_{+}-J_{-}$, where $J_{+}$ and $J_{-}$ are the current densities in opposite directions, the EM MTTF of a wire can be expressed as
$\gamma(1-\eta)\left|\bar{J}\right|^{m}$~\cite{260787}. In AC mode, $\eta$ changes with frequency. Below a lower threshold frequency, the AC MTTF is close to the DC case; between this threshold and a second, higher threshold, the AC MTTF increases with frequency; above the higher threshold, the AC MTTF levels off and increases only slightly. Both thresholds depend on the operating temperature and the material used.

In power supply networks, the healing effect is strongest under fully balanced AC stress. However, such an ideal balance is challenging to realize in practice. In the next section, we discuss our MTTF estimation model for the combined AC and DC situation.

%\subsection{Related Work}

%Various earlier observation~\cite{260787}~\cite{5232735}~\cite{36348}~\cite{Jing:add}~\cite{Jing:microEM} shows that driving similar amounts of current in the opposite direction can heal the EM damage. Thus, we propose a self-healing power supply network design in this work. 



